ANALYSIS  OF  STATIONARY  TIME  SERIES 


by 


GAIL EUGENE BACHMAN
B.  A.,  University  of  Wichita,  1959 


A  MASTER'S  REPORT 


submitted  in  partial  fulfillment  of  the 


requirements  for  the  degree 


MASTER  OF  SCIENCE 


Department  of  Statistics 


KANSAS  STATE  UNIVERSITY 
Manhattan,  Kansas 

1963 


Approved  by: 


Major Professor


TABLE OF CONTENTS

BACKGROUND OF THE TIME SERIES PROBLEM

NATURE OF THE TIME SERIES PROBLEM

    Stationary Time Series

    Models

    Testing a Series for Autocorrelation

    Testing for Autocorrelation in Residuals

    The Correlogram

THE AUTOREGRESSIVE MODEL

    Model with Lagged Dependent Variable

    Model with Autocorrelated Errors

PARAMETRIC TIME SERIES

    Variate Difference Method

    Oscillatory and Periodic Movements

SUMMARY AND CONCLUSIONS

ACKNOWLEDGMENT

REFERENCES

APPENDIX


BACKGROUND OF THE TIME SERIES PROBLEM

The usual model in least squares regression analysis is

$$x_t = \beta_0 + \sum_{i=1}^{r} \beta_i Z_{it} + \epsilon_t, \qquad t = 0, 1, \ldots, n-1,$$

where the Z's are assumed fixed in repeated sampling and the $\epsilon$'s are independently distributed with mean zero and variance $\sigma^2$. In analysis of variance data the Z's may be merely dummy variates with values 0 or 1. When tests of significance or confidence limits are desired, normality of the $\epsilon$'s is also assumed.

In the sciences many problems occur in which a process produces what may be considered a family of random variables such that there is a value of $x_t$ for each value of t in some interval T. The experimenter wishes to investigate the nature of the response curve over the interval T. One of the major difficulties in the application of traditional statistical methods to these time series data is the possible absence of independence of successive observations. If the $\epsilon$'s are not independent, the assumptions necessary for using ordinary least squares estimation theory are violated. It is the correlation of the $\epsilon$'s and not of the x's which is to be avoided.

Attitudes of research workers toward regression analysis of time series have varied between widely separated extremes. Until the middle 1920's, many researchers were completely unaware of the problems connected with the sampling of time series. Following the appearance of articles such as Yule's (1926) on "non-sense correlations," it was maintained that existing methods simply did not apply to time series and that reputable statisticians should leave time series alone. Koopmans, Wold and others clarified the sampling significance of regression analysis based on time series in the late 1930's. Considerable work followed on the problem of testing for the existence of correlation of the errors but all too little on the more important problem of the best estimation procedure when the correlations do exist. Results are still somewhat lacking in this latter area, but several estimation procedures have been proposed since 1950, some by social and natural scientists, and others by physical scientists and engineers. The method of spectrum analysis is most prominent in the latter category. This paper will deal primarily with those methods generally used in the social and biological sciences.

NATURE  OF  THE  TIME  SERIES  PROBLEM 

Stationary  Time  Series 

The discussion of time series is usually confined to what are called stationary processes or stationary time series. There are two important types of stationarity. A process is called strictly stationary if the distribution of the set

$$(x_{t_1}, \ldots, x_{t_n})$$

of random variables from $(x_t : t \in T)$ is the same as that of the set

$$(x_{t_1+h}, \ldots, x_{t_n+h})$$

for every n, $t_1, t_2, \ldots, t_n$ and h. This roughly means that the time series is without trends, not only in the mean values of the $x_t$ but also in their variances. Most of the studies in time series do not require the assumption of strict stationarity but are based on the weaker assumptions that: (1) $E(x_t)$ is a constant for all t which may be taken as zero, and (2) the distributions above have the same covariance matrix for all h. A time series $(x_t : t \in T)$ is said to be weakly stationary if it satisfies these two conditions. Hence the covariance matrix depends only upon the time differences

$$t_2 - t_1, \; t_3 - t_2, \; \ldots, \; t_n - t_{n-1},$$

and the covariance of $x_{t+h}$ and $x_t$ is a function of h only. If $E(x_t)$ is taken to be zero, then $E(x_t x_{t+h}) = \gamma_h$. The covariance $\gamma_h$ is usually called the autocovariance between $x_t$ and $x_{t+h}$, and $\rho_h = \gamma_h / \gamma_0$ is called the autocorrelation function of lag h.
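
As a rough illustration of these definitions, the sketch below estimates $\gamma_h$ and $\rho_h$ from a finite sample. The function names and the divisor n are assumptions made for this example and are not notation from the report; other divisors, such as n - h, are also in common use.

```python
def sample_autocovariance(x, h):
    """Estimate the lag-h autocovariance of a series assumed weakly stationary."""
    n = len(x)
    mean = sum(x) / n
    # Products of mean-corrected values h steps apart, divided by n (assumed convention).
    return sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n


def sample_autocorrelation(x, h):
    """Estimate rho_h as the lag-h autocovariance divided by the lag-0 autocovariance."""
    return sample_autocovariance(x, h) / sample_autocovariance(x, 0)


series = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values only
print(sample_autocovariance(series, 0))
print(sample_autocorrelation(series, 1))
```

With only five points the estimates are of course unreliable; the example is meant to show the arithmetic, not a recommended sample size.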

For some time series $(y_t : t \in T)$ the model will have the form

$$y_t = m_t + x_t,$$

where $m_t$ is a constant for each t and $(x_t : t \in T)$ is a stationary time series with $E(x_t) = 0$. Since $E(y_t) = m_t$, the covariance function of $(y_t : t \in T)$,

$$E[(y_{t+h} - m_{t+h})(y_t - m_t)] = E(x_{t+h} x_t),$$

is identical with that of $(x_t : t \in T)$. Estimating $m_t$ and $\gamma_h$ from a finite number (n) of observations taken from the time series is one of the problems of time series analysis.

Models 

There are a number of models which may be used in analyzing time series. If it is assumed that the data follow an underlying systematic scheme with random fluctuations superimposed, the methods of harmonic analysis and periodogram analysis may be used to determine the nature of the systematic component for functions with regular periods. In cases where the periods are known, for example, seasonal variation studies in economics, harmonic analysis is used to determine the amplitudes. Where the periods are regular but unknown, periodogram analysis can be used to seek out the hidden periodicities. If the systematic movement is oscillatory with irregular periods, the variables $Z_i$ in the regular regression model may become $t^i$, and a polynomial form used to locally describe the systematic component.

In other cases, the assumed model may involve lagged values of x as predictors. An example of this autoregressive model might be

$$x_t = \beta_0 + \sum_{i=1}^{r} \beta_i x_{t-i} + \epsilon_t.$$

Or finally, a combined regression model could be used with lagged x's, present Z's and lagged Z's as predictors.

The choice between models is very difficult. It may happen that one model fits well and the others rather poorly. For short series it is usually impossible to determine whether this phenomenon is due to the choice of the model or to the particularities of the sample analyzed. Particularly for the autoregressive schemes, tests for goodness of fit are not well developed.

Testing  a  Series  for  Autocorrelation 

Suppose $(x_1, \ldots, x_n)$ is a sample from a normal time series $(x_t)$ and the hypothesis to be tested is that $x_1, \ldots, x_n$ are independent random variables having identical normal distributions $N(\mu, \sigma^2)$. The term white noise is often used in reference to such independent random variables.

R. L. Anderson (1942) proposed a criterion for testing this hypothesis with the ratio

$$R_1 = c_1 / c_0,$$

where $n c_h = \sum_{t=1}^{n} (x_t - \bar{x})(x_{t+h} - \bar{x})$, h = 0, 1, and $x_{n+1} = x_1$. Use of the relation $x_{n+1} = x_1$, as opposed to running the summation from t = 1 to n - 1, is somewhat arbitrary, but it simplifies the distribution theory of $R_1$. If a sample is from a white noise, then with n large, $R_1$ will tend to have values near 0. If the sample is not from a white noise, then $R_1$ will tend to have values away from 0.

Anderson derived the sampling distribution for $R_1$ and has prepared tables for $\Pr[R_1 > R_1(\alpha)] = \alpha$, for $\alpha$ = 0.99, 0.95, 0.05, 0.01, and n = 5(1)15(5)75. Values of $R_h$ for lag other than 1 may be tested using the table for $R_1$, since for large samples $R_h$ is approximately distributed like $R_1$. For large n, Anderson also showed that $R_1$ is approximately normally distributed with mean $-1/(n-1)$ and variance $(n-2)/(n-1)^2$.
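
The circular coefficient and its large-sample normal approximation can be sketched as follows. This is only an illustration under the mean and variance quoted above; the function names are hypothetical, and for small n the tabulated significance points rather than the normal score would be used.

```python
import math


def circular_serial_correlation(x, lag=1):
    """Circular serial correlation R_lag, wrapping the series so that x[n + t] = x[t]."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    ch = sum(dev[t] * dev[(t + lag) % n] for t in range(n))
    return ch / c0


def r1_normal_score(x):
    """Standardize R_1 using mean -1/(n-1) and variance (n-2)/(n-1)^2 (large-n approximation)."""
    n = len(x)
    r1 = circular_serial_correlation(x, 1)
    mean = -1.0 / (n - 1)
    variance = (n - 2) / (n - 1) ** 2
    return (r1 - mean) / math.sqrt(variance)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values only
print(circular_serial_correlation(sample), r1_normal_score(sample))
```

For this short trending sample the wrap-around product of the last and first deviations cancels the other products, which shows how strongly the circular definition can affect a short series.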

Koopmans (1942) examined $R_1$ as an estimate of $\rho$ in the simple autoregressive model

$$x_t = \rho x_{t-1} + \epsilon_t.$$

The circular definition of $R_1$ was not satisfactory if the alternative hypothesis specified this form. Von Neumann (1941) had earlier obtained the distribution of

$$\frac{\delta^2}{s^2} = \frac{\sum_{t=1}^{n-1} (x_{t+1} - x_t)^2 / (n-1)}{\sum_{t=1}^{n} (x_t - \bar{x})^2 / n}.$$

Hart (1942) tabulated the probabilities by use of a series approximation. T. W. Anderson (195^) then showed that

$$R_c = \frac{\tfrac{1}{2}\left[(x_1 - \bar{x})^2 + (x_n - \bar{x})^2\right] + \sum_{t=1}^{n-1} (x_t - \bar{x})(x_{t+1} - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}$$

had greater power than $R_1$ in testing the hypothesis that $\rho = 0$ for Koopmans' model and that no uniformly most powerful test exists for such a hypothesis. Since

$$\frac{\delta^2}{s^2} = \frac{2n(1 - R_c)}{n-1},$$

Anderson was able to transform Hart's significance levels into significance levels for $R_c$.
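
The relation between the von Neumann ratio and $R_c$ is an algebraic identity and can be checked numerically. The sketch below is only such a check; the function names and the data are invented for the illustration.

```python
def von_neumann_ratio(x):
    """Mean square successive difference divided by the variance (divisors n-1 and n)."""
    n = len(x)
    mean = sum(x) / n
    delta_sq = sum((x[t + 1] - x[t]) ** 2 for t in range(n - 1)) / (n - 1)
    s_sq = sum((v - mean) ** 2 for v in x) / n
    return delta_sq / s_sq


def anderson_rc(x):
    """Non-circular coefficient R_c with half-weighted end terms in the numerator."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    end_terms = 0.5 * (dev[0] ** 2 + dev[-1] ** 2)
    cross = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    return (end_terms + cross) / sum(d * d for d in dev)


x = [2.0, 5.0, 3.0, 8.0, 6.0, 1.0]  # illustrative values only
n = len(x)
# The two printed numbers agree up to rounding error.
print(von_neumann_ratio(x), 2 * n * (1 - anderson_rc(x)) / (n - 1))
```

Because the identity holds for any sample, a significance point for one statistic converts directly into a significance point for the other.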

A non-parametric test of great simplicity is due to Wald and Wolfowitz (1943), but it is also circularly defined and somewhat limited in use. Many other papers on testing for autocorrelation have appeared; those mentioned here are probably the most significant.

Testing  for  Autocorrelation  in  Residuals 

In the previous section on testing for autocorrelation the process $(x_t)$ considered had mean value zero and, if necessary, the mean correction was applied. The general class of time processes with which the experimenter is usually concerned will need to be reduced to stationary form by simple subtraction of a time dependent mean. Such a preliminary treatment of data will nearly always be necessary before methods of stationary time series can be applied.

Consider the linear regression of a variable $y_t$ upon k regressor variables $Z_{1t}, Z_{2t}, \ldots, Z_{kt}$. Then $y_t$ is regarded as being generated by a relation of the form

$$y_t = \beta_0 + \sum_{i=1}^{k} \beta_i Z_{it} + x_t,$$

where $x_t$ is generated by a stationary process. The $Z_{it}$ are considered fixed and independent of the $x_t$, and inferences are made conditionally upon the fixing of the $Z_{it}$ at their observed values.

It has been shown, as shall be more fully illustrated in a later section, that the departure of the process generating $x_t$ from a process generating independent random variables may affect both the efficiency of the least squares methods and the validity of the usual tests of significance. Lacking any precise prior knowledge as to the nature of the data, a reasonable procedure may be to carry out an initial regression based on the assumption that the $x_t$'s are white noise. The $x_t$ can then be tested for mutual independence. The fact that the $\beta$'s are estimated invalidates the use of the methods of detecting autocorrelation presented in the previous section.

A small sample test of the null hypothesis that the $x_t$ are independent and normal with zero mean is due to Durbin and Watson (1950, 1951). Let the n successive least squares residuals be $e_1, e_2, \ldots, e_n$. A modification of the von Neumann statistic,

$$d = \frac{\sum_{i=1}^{n-1} (e_{i+1} - e_i)^2}{\sum_{i=1}^{n} e_i^2},$$

is used to test for the existence of autocorrelation in the residuals. It will be noted that

$$\frac{\delta^2}{s^2} = \frac{nd}{n-1} \quad \text{and} \quad d = 2(1 - R_c),$$

but since the original von Neumann and T. W. Anderson statistics did not refer to the residuals from a regression analysis, tables for those statistics cannot be used here. An exact distribution for d cannot be evaluated, but upper and lower significance bounds, $d_U$ and $d_L$, could be computed. This was done by Durbin and Watson for 5%, 2.5%, and 1% one-tailed tests, for n = 15(1)40(5)100 and for k = 1(1)5. It should be noted that $d_U$ and $d_L$ will diverge as k increases and also as n decreases.

In most cases the experimenter desires a test of the null hypothesis against the alternative of positive correlation. The expected value of d will be small when the null hypothesis is false, so if the computed value of d is less than the tabulated lower bound $d_L$ the null hypothesis is rejected. If d exceeds the upper bound $d_U$ the null hypothesis is not rejected, and for values between the two bounds the test is inconclusive. If the alternative hypothesis was negative correlation, d would be expected to be near 4. In this case $d' = 4 - d$ is considered and tested against the tabulated value as above. Durbin and Watson present alternative approximation methods for use when n is greater than 40.
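
A minimal sketch of the computation of d from least squares residuals is given below. It assumes a single regressor with an intercept, and the function names and data are hypothetical; the significance bounds themselves must still be taken from the published tables.

```python
def ols_residuals(y, z):
    """Residuals from the least squares line of y on one regressor z, with intercept."""
    n = len(y)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    intercept = y_bar - slope * z_bar
    return [yi - intercept - slope * zi for zi, yi in zip(z, y)]


def durbin_watson(e):
    """Sum of squared successive differences of residuals over their sum of squares."""
    numerator = sum((e[i + 1] - e[i]) ** 2 for i in range(len(e) - 1))
    denominator = sum(v * v for v in e)
    return numerator / denominator


z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # illustrative regressor
y = [1.2, 1.9, 3.4, 4.1, 4.8, 5.7, 7.3, 8.1]   # illustrative response
d = durbin_watson(ols_residuals(y, z))
print(d, 4.0 - d)  # d for positive correlation, 4 - d for negative correlation
```

Values of d well below 2 point toward positive autocorrelation and values well above 2 toward negative autocorrelation, which is why the second printed quantity is the one compared with the tables in the negative case.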

Moran (1950) presented an exact test for the residuals from regression when only one predictor is used. He used the first two autocorrelation coefficients of the residuals, defined in a circular fashion, and showed that the expected value of the autocorrelation coefficient of the residuals $r_1$ is


and  that 


^.  .   -(1+%) 
^^^1^  =  -WIT- 


2.        N+1        2Ri+3Rf-2R2 
ECrn )  =  . — -  -f-  — ± ± ^ 

N^      N(N-2) 


Finally Moran shows that for large samples the quantity

$$\frac{r_1 - E(r_1)}{\sqrt{E(r_1^2) - [E(r_1)]^2}}$$

is normally distributed with mean zero and variance one.

The  Correlogram 

A useful tool in the analysis of time series, first proposed by Wold, is called the correlogram. Wold (1953) indicated that the choice of possible models used to explain stationary time series data depended upon the relationship of successive true autocorrelation coefficients $\rho_h$. The sample values $R_h$ are usually displayed graphically as in Fig. 1.

Three possible forms of the correlogram are readily apparent. First the curve may be strictly periodic with repeated non-damped cycles. This suggests the use of harmonic analysis. Secondly, the curve may be damped but with $|\rho_L|$ greater than zero. This type of curve may be generated by a linear autoregressive model. The third alternative is a damped correlogram with $\rho_L$ equal to 0 for all L greater than some m. Wold suggests the use of moving averages to transform the data to non-autocorrelated observations.

Fig. 1. Correlogram.

The extent to which the fine structure of a correlogram can be interpreted seems limited and it appears best to concentrate on certain features such as pronounced oscillations and the speed with which the $R_h$ converge to zero. Bartlett (1946) has shown that successive autocorrelation coefficients tend to be autocorrelated and hence caution should be used in determining the model from the correlogram. Especially with relatively short time series, the empirical correlogram may depend more upon the properties of the sample than upon the population, but it still is a valuable tool in selecting a suitable model.

To illustrate these concepts, precipitation data for Manhattan, Kansas, were obtained and the probability of a dry day was calculated for each day of the year. These probabilities are listed in the Appendix. Now if it is assumed that the observed probabilities are the result of random variation superimposed on a systematic model, the correlogram for the 365 observations may give an indication of the appropriate model.

The autocorrelation coefficients (circular definition) for selected lags are listed in Table 1 and plotted in Fig. 2. Although care must be taken in drawing conclusions from the correlogram, the oscillation suggests that either an autoregressive model or a periodic model be used to estimate the systematic component of the data. It will be shown later that harmonic analysis used to fit a periodic function gives errors which appear random.
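
The kind of correlogram obtained from a strongly seasonal series can be imitated with a small sketch. The series below is a synthetic stand-in (a single annual cosine wave), not the Appendix data, and the function name is hypothetical.

```python
import math


def circular_correlogram(x, lags):
    """Circular autocorrelation coefficients R_h for each lag h in `lags`."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return {h: sum(dev[t] * dev[(t + h) % n] for t in range(n)) / c0 for h in lags}


# Synthetic stand-in for 365 daily probabilities: one annual cycle, no noise.
days = 365
series = [0.70 + 0.10 * math.cos(2.0 * math.pi * t / days) for t in range(days)]

correlogram = circular_correlogram(series, range(0, days, 5))
for h in (5, 90, 180, 360):
    print(h, round(correlogram[h], 4))
```

Under the circular definition the coefficient at lag h equals the coefficient at lag 365 - h, so a correlogram of this kind is symmetric about the half-year point; for the noise-free cosine used here it reduces to cos(2*pi*h/365).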


Table 1. Coefficients of autocorrelation for probabilities of dry days in Manhattan, Kansas.

Lag     R_h       Lag     R_h       Lag     R_h       Lag     R_h

  1    .6711       95   -.0899      190   -.5405      285    .0825
  5    .6313      100   -.1065      195   -.5017      290    .1317
 10    .6246      105   -.1557      200   -.5143      295    .1811
 15    .5932      110   -.1437      205   -.4711      300    .1860
 20    .5712      115   -.2766      210   -.4543      305    .2565
 25    .5189      120   -.2720      215   -.4534      310    .2893
 30    .5157      125   -.2869      220   -.4452      315    .3421
 35    .4645      130   -.3342      225   -.4035      320    .3479
 40    .4124      135   -.3775      230   -.3775      325    .4123
 45    .3479      140   -.4035      235   -.3342      330    .4645
 50    .3421      145   -.4452      240   -.2869      335    .5156
 55    .2893      150   -.4533      245   -.2721      340    .5188
 60    .2566      155   -.4543      250   -.2766      345    .5712
 65    .1860      160   -.4710      255   -.1437      350    .5932
 70    .1811      165   -.5142      260   -.1558      355    .6246
 75    .1317      170   -.5017      265   -.1065      360    .6313
 80    .0825      175   -.5405      270   -.0899
 85    .0221      180   -.5310      275   -.0295
 90   -.0295      185   -.5310      280    .0221

Fig. 2. Correlogram for Probabilities of Dry Days.

THE  AUTOREGRESSIVE  MODEL 

In many fields of study the time series phenomena may be represented by a regression model of the form

$$x_t + \alpha_1 x_{t-1} + \cdots + \alpha_p x_{t-p} = \beta_1 Z_{1t} + \cdots + \beta_q Z_{qt} + \epsilon_t, \qquad t = 0, 1, \ldots, n-1, \qquad (1)$$

where $\epsilon_t$ is a series of independently and identically distributed random variables with mean zero and variance $\sigma^2$. This is a generalization of both the ordinary regression model

$$x_t = \beta_1 Z_{1t} + \cdots + \beta_q Z_{qt} + \epsilon_t$$

and of the autoregression model

$$x_t + \alpha_1 x_{t-1} + \cdots + \alpha_p x_{t-p} + \alpha_0 = \epsilon_t.$$

Values of $x_0, x_{-1}, \ldots, x_{-p+1}$ are usually regarded as given numbers, or if they are considered as random variables, inferences are made conditionally on those quantities held fixed.

Since the coefficients in the normal equations are random variables for model (1), rather than constants as for ordinary regression models, difficulties arise in finding the sampling distributions of the least-squares estimators.

Model  with  Lagged  Dependent  Variable 

Mann and Wald (1943) studied the autoregression model

$$x_t + \alpha_1 x_{t-1} + \cdots + \alpha_p x_{t-p} + \alpha_0 = \epsilon_t$$

and showed that ordinary least squares theory is valid asymptotically.

A method presented by Durbin (1960) examines the properties of estimators for the model containing lagged x's. He considered the simplest cases of the regular regression model, namely,

$$x_t = \beta Z_t + \epsilon_t, \qquad t = 1, \ldots, n,$$

where $Z_1, \ldots, Z_n$ are constants, and

$$x_t + \alpha x_{t-1} = \epsilon_t, \qquad t = 1, \ldots, n,$$

where $x_0$ is constant. Both cases take $\epsilon_t$ to be independently and identically distributed with mean zero and variance $\sigma^2$.

Application of least squares gives estimates for $\beta$ and $\alpha$ of the form

$$b = \frac{\sum_{t=1}^{n} x_t Z_t}{\sum_{t=1}^{n} Z_t^2} \quad \text{and} \quad a = -\frac{\sum_{t=1}^{n} x_t x_{t-1}}{\sum_{t=1}^{n} x_{t-1}^2}$$

respectively. The estimate b is the minimum-variance unbiased estimator for $\beta$, but a is biased and its small sample properties do not follow directly from any classical theory. The difference arises from the fact that while b is a linear function of the x's and relatively easy to handle, a is a ratio of quadratic forms.
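
The two estimators can be sketched directly from their formulas. The simulation below is only an illustration: the parameter value, sample size, seed and function names are assumptions of the example, and the errors are taken to be normal with unit variance.

```python
import random


def simulate_ar1(alpha, n, x0=0.0, seed=0):
    """Generate x_0, ..., x_n from x_t + alpha * x_{t-1} = e_t with unit-variance normal e_t."""
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n):
        x.append(-alpha * x[-1] + rng.gauss(0.0, 1.0))
    return x


def least_squares_alpha(x):
    """Estimator a: minus the sum of lagged products over the sum of squared lagged values."""
    numerator = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    denominator = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return -numerator / denominator


def least_squares_beta(x, z):
    """Estimator b for regression through the origin on fixed constants z."""
    return sum(xi * zi for xi, zi in zip(x, z)) / sum(zi * zi for zi in z)


series = simulate_ar1(alpha=0.5, n=20000)
a_hat = least_squares_alpha(series)
print(a_hat)  # close to 0.5 for a long series; short series show noticeable bias
```

The numerator and denominator of a both depend on the same random x's, which is the ratio-of-quadratic-forms structure referred to in the text; b, by contrast, has a fixed denominator.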

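In developing a reasonable optimality criterion for estimating $\alpha$, Durbin considers the estimating equation

$$a \sum_{t=1}^{n} x_{t-1}^2 + \sum_{t=1}^{n} x_t x_{t-1} = 0,$$

from which a is derived. If a is replaced by $\alpha$, the left side becomes $\sum x_{t-1}(x_t + \alpha x_{t-1}) = \sum x_{t-1} \epsilon_t$, so that

$$E\left(\alpha \sum_{t=1}^{n} x_{t-1}^2 + \sum_{t=1}^{n} x_t x_{t-1}\right) = 0.$$
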
The linear equation in a is called an unbiased estimating equation in accordance with a definition by Durbin:

Suppose that the estimator a of a parameter $\alpha$ is given by the linear equation

$$T_1 a + T_2 = 0, \qquad (2)$$

where $T_1$ and $T_2$ are functions of the observations such that $T_2/T_1$ is independent of unknown parameters, and where

$$E(T_1 \alpha + T_2) = 0. \qquad (2a)$$

Then equation (2) is called an unbiased linear estimating equation.

Linear in this case means linear in a, not linear in the observations. The quantity $T_1$ is assumed to be non-zero. If $T_1 = 1$, this definition includes the ordinary notion of an unbiased estimator.

A second definition is necessary in requiring the analogue of minimum variance of an unbiased estimator. Since (2) may be multiplied through by an arbitrary constant without affecting the value of a, requiring $T_1 \alpha + T_2$ to have minimum variance is not enough. To take care of this situation, the equation was standardized by dividing through by $E(T_1)$.

If $t_1 = T_1/E(T_1)$ and $t_2 = T_2/E(T_1)$ then Durbin's second definition is:

Suppose that $t_1 a + t_2 = 0$ is an unbiased linear estimating equation where $E(t_1) = 1$ and

$$V(t_1 \alpha + t_2) \le V(t_1' \alpha + t_2'), \qquad (3)$$

for all other unbiased linear estimating equations $t_1' a + t_2' = 0$ having $E(t_1') = 1$. Then $t_1 a + t_2 = 0$ is called a best unbiased linear estimating equation.

The notion of a minimum variance unbiased estimator is included in this definition, for if $t_1 = t_1' = 1$ and (2) is satisfied, a is a minimum variance unbiased estimator of $\alpha$.

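Now a lower bound for the variance of $t_1 \alpha + t_2$ is derived. Let $T_1 a + T_2 = 0$ be an unbiased estimating equation where $T_1$ and $T_2$ depend only on the observations. If the sample density is $\phi(x_1, \ldots, x_n; \alpha)$, then from the unbiasedness condition $E(T_1 \alpha + T_2) = 0$,

$$\int (T_1 \alpha + T_2)\, \phi \, dx = 0,$$

where $\int$ denotes the multiple integral and dx stands for $dx_1, \ldots, dx_n$. If the conditions for differentiating under the integral sign are satisfied, differentiation with respect to $\alpha$, followed by division by $E(T_1)$, gives

$$\int t_1 \phi \, dx + \int (t_1 \alpha + t_2)(\partial \phi / \partial \alpha) \, dx = 0.$$

Since $E(t_1) = \int t_1 \phi \, dx = 1$ and $\partial \phi / \partial \alpha = \phi \, \partial \log \phi / \partial \alpha$, one may write

$$\int (t_1 \alpha + t_2)(\partial \log \phi / \partial \alpha)\, \phi \, dx = -1.$$

By Schwarz's inequality

$$E(t_1 \alpha + t_2)^2 \, E(\partial \log \phi / \partial \alpha)^2 \ge \left[ \int (t_1 \alpha + t_2)(\partial \log \phi / \partial \alpha)\, \phi \, dx \right]^2 = 1,$$

so that finally

$$V(t_1 \alpha + t_2) \ge \frac{1}{E(\partial \log \phi / \partial \alpha)^2} = -\frac{1}{E(\partial^2 \log \phi / \partial \alpha^2)}. \qquad (4)$$

If $t_1 \equiv 1$ then $-t_2$ is an unbiased estimator and (4) becomes the Cramér-Rao inequality

$$E(t_2 + \alpha)^2 \ge \frac{1}{E(\partial \log \phi / \partial \alpha)^2}$$

for the lower bound on the variance of an unbiased estimator.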

Quantity $t_1$ may not be identically equal to one but may converge stochastically to one as $n \to \infty$. In this case let $\delta_n$ be a function of n such that

$$E(\partial \log \phi / \partial \alpha)^2 = O(\delta_n^2).$$

It follows that since $t_1(a - \alpha) = -(t_1 \alpha + t_2)$, the limiting minimum variance is that found above. Thus the asymptotic distribution of $\delta_n (a - \alpha)$, if it exists, has mean zero and the limiting minimum variance is

$$\lim_{n \to \infty} \frac{\delta_n^2}{E(\partial \log \phi / \partial \alpha)^2}.$$

For single parameter problems $\delta_n$ is usually equal to $\sqrt{n}$.

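As an example, the errors $\epsilon_1, \ldots, \epsilon_n$ for the autoregressive model can be assumed normally distributed with unit variance. The density function is then

$$\phi = (2\pi)^{-n/2} \exp\left[ -\tfrac{1}{2} \sum_{t=1}^{n} (x_t + \alpha x_{t-1})^2 \right].$$

Application of the method of maximum likelihood gives the linear estimating equation, namely,

$$a \sum_{t=1}^{n} x_{t-1}^2 + \sum_{t=1}^{n} x_t x_{t-1} = 0,$$

which is unbiased according to the previous definition. Differentiation leads to

$$-\partial^2 \log \phi / \partial \alpha^2 = \sum_{t=1}^{n} x_{t-1}^2,$$

so from the previous derivation, with $T_1 = \sum_{t=1}^{n} x_{t-1}^2$ and $T_2 = \sum_{t=1}^{n} x_t x_{t-1}$, one obtains

$$V(t_1 \alpha + t_2) \ge \frac{1}{E\left( \sum_{t=1}^{n} x_{t-1}^2 \right)}.$$

For this example this lower bound is actually attained, as

$$t_1 \alpha + t_2 = -\frac{\partial \log \phi}{\partial \alpha} \Big/ E\left( \sum_{t=1}^{n} x_{t-1}^2 \right)$$

and

$$V(t_1 \alpha + t_2) = \frac{E(\partial \log \phi / \partial \alpha)^2}{\left[ E\left( \sum_{t=1}^{n} x_{t-1}^2 \right) \right]^2} = \frac{1}{E\left( \sum_{t=1}^{n} x_{t-1}^2 \right)}.$$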

To consider the variance of the estimator a, asymptotic theory and the assumption $|\alpha| < 1$ are used. The expected value

$$\frac{1}{n} E\left( \sum_{t=1}^{n} x_{t-1}^2 \right) \to \frac{1}{1 - \alpha^2}$$

as n becomes large, so that in the limit $\sqrt{n}(a - \alpha)$ will have zero mean and variance $1 - \alpha^2$. Thus a is an asymptotically efficient estimator of $\alpha$ since $1 - \alpha^2$ is the minimum variance possible.
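
The limiting variance quoted here can be traced in a few lines. The sketch assumes that the process has reached its stationary state, so that $E(x_t^2)$ does not depend on t; that is an assumption of this sketch rather than a step spelled out in the report.

```latex
\begin{align*}
x_t = -\alpha x_{t-1} + \epsilon_t,\ \operatorname{Var}(\epsilon_t) = 1,\ |\alpha| < 1
  &\;\Longrightarrow\; E(x_t^2) = \alpha^2 E(x_{t-1}^2) + 1
  \;\Longrightarrow\; E(x_t^2) = \frac{1}{1 - \alpha^2}, \\
V(t_1\alpha + t_2) = \frac{1}{E\left(\sum_{t=1}^{n} x_{t-1}^2\right)}
  &\;\approx\; \frac{1 - \alpha^2}{n}, \\
t_1 (a - \alpha) = -(t_1\alpha + t_2),\ t_1 \to 1
  &\;\Longrightarrow\; V\left[\sqrt{n}\,(a - \alpha)\right] \to 1 - \alpha^2 .
\end{align*}
```

The first line uses the independence of $\epsilon_t$ and $x_{t-1}$; the last line uses the stochastic convergence of $t_1$ to one.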

Durbin extends this proof to multi-parameter problems and shows that, in general, the same properties hold. These results are important for the next section where the model has autocorrelated errors.

Hurwicz (1950) studied the small sample bias of the parameters in autoregressive models and indicated the serious proportions that the bias may take on.

Model  with  Autocorrelated  Errors 

For many situations, the appropriate model is

$$x_t = \beta_1 Z_{1t} + \cdots + \beta_q Z_{qt} + U_t, \qquad t = 0, \ldots, n-1, \qquad (5)$$

where $(U_t)$ is a stationary autoregressive series given by

$$U_t + \alpha_1 U_{t-1} + \cdots + \alpha_p U_{t-p} = \epsilon_t, \qquad t = \ldots, -1, 0, 1, \ldots, \qquad (5a)$$

and where the Z's are given constants. The $\epsilon_t$'s are assumed to be independently and identically distributed with mean zero and variance $\sigma^2$. This model differs from the model of the previous section in that it does not contain lagged x's and has autocorrelated error terms.

The common assumption of independence of error terms may be violated in data such as a series of outputs of a production process. Cochrane and Orcutt (1949) have offered three reasons why the error terms in economic time models tend to be autocorrelated:

1. Use of incorrect functional form of the relationship.

2. Omitted variables are usually autocorrelated.

3. Errors of measurement are often autocorrelated.

They conducted some empirical sampling studies using generated autoregressive error processes with a given regression model. The series used were analogous in length to most available economic time series with approximately twenty observations. When least squares regression was used to analyze the generated series, the results indicated:

1. Estimated autocorrelation of the residuals tended to be biased toward randomness.

2. The least squares estimates are not biased even though they are not the best estimates.

3. When the autocorrelation of the errors is high the variance of the least squares estimates is greatly increased.

4. Nearly optimum results can be achieved if the error term is only a rough approximation to a random series so even a simple transformation of the error term may be adequate.

5. If sample residuals are used to estimate the error variance, $\sigma^2$, this estimate will be too small when the errors are positively correlated.

6. Analyzing first differences is a good method for economic problems.

Champernowne (1948) showed similar results by theoretical work with this model.

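Now returning to the original model, if the $U_t, U_{t-1}, \ldots, U_{t-p}$ are expressed in terms of the x's and Z's using (5), the model (5a) becomes

$$x_t + \alpha_1 x_{t-1} + \cdots + \alpha_p x_{t-p} = \beta_1 Z_{1t} + \cdots + \beta_q Z_{qt} + \alpha_1 \beta_1 Z_{1,t-1} + \cdots + \alpha_p \beta_q Z_{q,t-p} + \epsilon_t.$$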

An investigation of the efficiencies and estimated variances of least squares estimates of regression coefficients for fixed Z's and tests of hypotheses concerning them when an incorrect transforming model is used has been carried out by Watson (1951). Various types of general solutions are presented: bounds on the bias of the estimated variance, a lower bound on the efficiency of the estimates of regression coefficients, and some bounds on the significance points of the t and F tests. Some special types of incorrect transformations are also discussed. It was found that for what appeared to be only mildly inaccurate estimates, the true probabilities for 5% significance levels may be considerably different. Watson takes a rather pessimistic view of the use of transforming devices to remove the effect of autocorrelation in time series data.

Application of least squares to the transformed model will, in principle, lead to optimum estimates when the $\epsilon_t$ are normally distributed. The resulting equations will be non-linear and hence difficult to solve. Some sort of iterative procedure is required. Various methods have been suggested by Champernowne (1948), Cochrane and Orcutt (1949), Durbin (1960) and others, but these are computationally inefficient.

Fuller and Martin (1961) have suggested the simultaneous estimation of both the error structure and the model by least squares. This method appears much more promising for practical work. For the first order autoregressive error and a single Z variable this method is easily illustrated. The models are

$$x_t = \beta Z_t + U_t \qquad (6)$$

and

$$U_t + \alpha U_{t-1} = \epsilon_t. \qquad (7)$$

Substituting (6) and (6) lagged into (7) gives

$$x_t = \beta Z_t + \alpha \beta Z_{t-1} - \alpha x_{t-1} + \epsilon_t. \qquad (8)$$

Estimation of $\alpha$ and $\beta$ is now clearly a problem in non-linear estimation. If the equation is rewritten as

$$x_t = \theta_1 Z_t + \theta_2 Z_{t-1} + \theta_3 x_{t-1} + \epsilon_t,$$

where $\theta_2 = -\theta_1 \theta_3$, the problem may now be viewed as a non-linear restriction upon the three parameters. Independence of the errors will now be a special case with $\alpha = 0$.

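This problem can now be handled by the modified Gauss-Newton iterative procedure. The problem becomes one of regression by expanding (8) in a Taylor's series about a point $P_0 = (\alpha_0, \beta_0)$, where $\alpha_0$ and $\beta_0$ are guessed values of the parameters. If only the first order terms are considered, then

$$x_t - x_{t0} = (Z_t + \alpha_0 Z_{t-1})\, \Delta\beta + (\beta_0 Z_{t-1} - x_{t-1})\, \Delta\alpha,$$

where $x_{t0} = \beta_0 Z_t + \alpha_0 \beta_0 Z_{t-1} - \alpha_0 x_{t-1}$ is the value fitted at $P_0$, $\Delta\beta = \beta - \beta_0$ and $\Delta\alpha = \alpha - \alpha_0$. The corrections in the trial values, $\Delta\alpha$ and $\Delta\beta$, can be found by regressing $(x_t - x_{t0})$ on $(Z_t + \alpha_0 Z_{t-1})$ and $(\beta_0 Z_{t-1} - x_{t-1})$.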

Hartley (1961) has shown that the residual sum of squares
decreases in the Gauss direction, that is, that some $k > 0$ exists
such that the residual sum of squares associated with
$P_{0k} = (\alpha_0 + k\Delta\alpha, \beta_0 + k\Delta\beta)$ is less
than the residual sum of squares associated with $P_0$. It may
happen that the full step results in an increase in the residual sum
of squares. In order to assure a decrease in the residual it is
necessary to compare the preceding residual with the computed
residual sum of squares at the end of each iteration. If a decrease
is recorded, $(\alpha_0 + \Delta\alpha, \beta_0 + \Delta\beta)$ are
used as start values for the next iteration. If a decrease is not
recorded, the start values are taken as
$(\alpha_0 + \tfrac12 \Delta\alpha, \beta_0 + \tfrac12 \Delta\beta)$
and the residual sum of squares computed. If a decrease is not noted
at this step, the residual sum associated with
$(\alpha_0 + \tfrac14 \Delta\alpha, \beta_0 + \tfrac14 \Delta\beta)$
is found and so on until the decrease occurs. The iteration is
carried on until the $\Delta\alpha$ and $\Delta\beta$ satisfy a
criterion of form

$$\frac{(\Delta\alpha_i)^2}{\operatorname{var}(\hat{\alpha}_i)}$$
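
The iteration with step halving can be sketched in Python with NumPy.
This is only an illustration under stated assumptions, not the
program of Fuller and Martin or of Hartley: the names `rss` and
`gauss_newton_ar1`, the zero starting values, and the stopping rule
(a small absolute change in the estimates, used here in place of the
variance-scaled criterion) are hypothetical choices.

```python
import numpy as np

def rss(alpha, beta, x, z):
    """Residual sum of squares of model (8) at the trial point (alpha, beta)."""
    e = x[1:] - beta * z[1:] - alpha * beta * z[:-1] + alpha * x[:-1]
    return float(e @ e)

def gauss_newton_ar1(x, z, alpha0=0.0, beta0=0.0, tol=1e-12, max_iter=200):
    """Modified Gauss-Newton for x_t = b z_t + a b z_{t-1} - a x_{t-1} + e_t."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    alpha, beta = alpha0, beta0
    for _ in range(max_iter):
        # Residuals x_t - x_t0 at the current trial values.
        resid = x[1:] - beta * z[1:] - alpha * beta * z[:-1] + alpha * x[:-1]
        # First-order Taylor terms: derivatives with respect to beta and alpha.
        J = np.column_stack([z[1:] + alpha * z[:-1],
                             beta * z[:-1] - x[:-1]])
        (d_beta, d_alpha), *_ = np.linalg.lstsq(J, resid, rcond=None)
        # Halve the step until the residual sum of squares decreases.
        old = float(resid @ resid)
        k = 1.0
        while rss(alpha + k * d_alpha, beta + k * d_beta, x, z) >= old and k > 1e-10:
            k /= 2.0
        alpha, beta = alpha + k * d_alpha, beta + k * d_beta
        # Assumed stopping rule: the accepted corrections are negligible.
        if (k * d_alpha) ** 2 + (k * d_beta) ** 2 < tol:
            break
    return alpha, beta
```

Called as `gauss_newton_ar1(x, z)` on two equal-length series, the
function returns the final trial values of $\alpha$ and $\beta$.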


The serious problem may be in locating an initial approximation
$(\alpha_0, \beta_0)$ in the region of the absolute minimum of the
residual. A preliminary grid over a wide range for $\alpha$ and
$\beta$ may be necessary to find a sufficiently close approximation.
The absolute minimum and not a local minimum must be found.
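
A preliminary grid of this kind might be sketched as follows. The
function name `grid_start` and the grid limits used in any call are
hypothetical; the sketch simply evaluates the residual sum of squares
of (8) at every grid point and returns the best pair as a start value.

```python
import numpy as np

def grid_start(x, z, alphas, betas):
    """Return the (alpha, beta) grid point with the smallest residual
    sum of squares for x_t = b z_t + a b z_{t-1} - a x_{t-1} + e_t."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    best = None
    for a in alphas:
        for b in betas:
            e = x[1:] - b * z[1:] - a * b * z[:-1] + a * x[:-1]
            s = float(e @ e)
            if best is None or s < best[0]:
                best = (s, float(a), float(b))
    return best[1], best[2]
```

A coarse grid is usually enough, since the point found is only a
start value for the iterative procedure.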

If the $Z_t$ are assumed to be bounded and the $\epsilon_t$ normally
distributed, the final set of estimates are maximum likelihood
estimates possessing the properties of consistency and asymptotic
normality. Large sample variances and covariances are estimated in
the ordinary manner as the product of the elements, $c_{ij}$, of the
inverse of the variance-covariance matrix at the final iteration
and the estimated variance $s^2$. The variance is estimated by

$$s^2 = \frac{\sum_{t=0}^{n-1} \hat{\epsilon}_t^2}{n - r}$$

where r is the number of parameters estimated.

The exact nature of the correlation properties is of course
unknown. A second order autoregressive scheme

$$u_t + \alpha_1 u_{t-1} + \alpha_2 u_{t-2} = \epsilon_t$$

could be assumed and the parameters estimated in a similar way.
Goodness-of-fit tests for autoregressive schemes have not been
developed to a sufficient extent. It seems though that for most
situations, a model only roughly approximating the true one will
give the desired random error term.

PARAMETRIC TIME SERIES

The classical model which has been used widely in time-series
analysis consists of two parts, a systematic part $M_t$, and a random
element of error $\epsilon_t$ with mean zero and variance $\sigma^2$.
If the observed item is $x_t$ $(t = 0, 1, \ldots, n-1)$ the time
series has the form

$$x_t = M_t + \epsilon_t.$$

The stochastic element $\epsilon_t$ is superimposed on the
non-stochastic part $M_t$ and the error at one time point does not
affect a later observation. This model is not valid if the error
elements are autocorrelated.

Different methods of analysis are appropriate for different
assumptions about the nature of $M_t$. If the data indicate that
$M_t$ is a "smooth" function of time, that is, $M_t$ is not highly
irregular or periodic in form, a polynomial may be used to locally
represent the data. The autocorrelations of $M_t$ and $M_{t+h}$
$(h = 1, 2, \ldots, n)$ should be positive, zero, or small negative
numbers. A semi-empirical procedure known as the variate difference
method is commonly used to estimate the degree of this polynomial.

When oscillatory and periodic movements are present in the
data the function to be fitted must be of trigonometric form.
This usually involves the use of Fourier analysis or some related
procedure.

Variate Difference Method

Suppose the time series $x_t$, $t = \ldots, -1, 0, +1, \ldots$, is
known to be of the form

$$x_t = \sum_{p=0}^{k} \beta_p t^p + \epsilon_t = M_t + \epsilon_t$$

where $\beta_0, \beta_1, \ldots, \beta_k$ are unknown and where
$\epsilon_t$ is a random element with variance $\sigma^2$.

For a sample $(x_1, \ldots, x_n)$, $n > k+1$, minimum variance
estimators for the $\beta$'s can be obtained by least squares and the
variances of the estimators calculated by usual methods if the
parameter k is known. However, k is usually unknown and must also be
estimated.

A polynomial of degree p has the well-known property that its
(p+1)th finite differences vanish. Tintner (1940) has used this
property in developing the variate difference method for estimating
the value of k for the given model.

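Let $y_{h,t}$ be the time series defined by the hth forward
difference of the time series $x_t$. Since $\Delta$ is a linear
operator

$$y_{h,t} = \Delta^h x_t = \Delta^h M_t + \Delta^h \epsilon_t.$$

By the advancing difference formula

$$\Delta^h \epsilon_t = \epsilon_{t+h} - \binom{h}{1} \epsilon_{t+h-1} + \binom{h}{2} \epsilon_{t+h-2} - \cdots + (-1)^h \epsilon_t$$

and

$$\Delta^h M_t = M_{t+h} - \binom{h}{1} M_{t+h-1} + \binom{h}{2} M_{t+h-2} - \cdots + (-1)^h M_t.$$

If one considers the sequence of samples $(x_1, \ldots, x_{n+h})$,
$h = 1, 2, \ldots$ and forms the ratios

$$Q_h = \sum_{\xi=1}^{n} y_{h,\xi}^2 \Big/ \left[ n \binom{2h}{h} \right], \qquad h = 1, 2, \ldots$$

it follows that

$$E(Q_h) = \frac{\sum_{\xi=1}^{n} (\Delta^h M_\xi)^2 + E\left[ \sum_{\xi=1}^{n} (\Delta^h \epsilon_\xi)^2 \right]}{n \binom{2h}{h}}$$

since $M_t$ is a nonrandom function of t and $\epsilon_t$ is a random
element. From the above expression for $\Delta^h \epsilon_t$ we find

$$E(Q_h) = \frac{\sum_{\xi=1}^{n} (\Delta^h M_\xi)^2}{n \binom{2h}{h}} + \sigma^2.$$

The first term is always non-negative and will vanish for all
$h \ge k+1$. Therefore $E(Q_h) = \sigma^2$ for $h \ge k+1$.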

In a practical situation, the question then becomes: Which
difference series sufficiently explains the non-random part of the
time series so that all difference series of higher order are
estimates of $\sigma^2$ and represent $\epsilon_t$ alone?

Under the assumption that element $\epsilon_t$ is normally
distributed with mean zero and variance $\sigma^2$, a large sample
test has been given by O. Anderson (1929) for testing the hypothesis
that the variance of the difference series of order h is
approximately equal to the variance of difference series h+1, i.e.
$Q_h = Q_{h+1}$. For a sample $(x_1, \ldots, x_n)$, $n \gg k_0$, the
estimates of the variances of the difference series are

$$Q_h = \sum_{\xi=1}^{n-h} y_{h,\xi}^2 \Big/ \left[ (n-h) \binom{2h}{h} \right].$$

If the systematic part $M_t$ has been eliminated in the finite
difference series of order $h_0$, then approximately

$$Q_{h_0} = Q_{h_0+1} = Q_{h_0+2} = \cdots .$$
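
The estimates $Q_h$ are easily computed by machine. The following
sketch, in Python with NumPy, forms the hth differences and divides
the sum of squares by $(n-h)\binom{2h}{h}$; the function name
`variate_difference_Q` is hypothetical.

```python
import numpy as np
from math import comb

def variate_difference_Q(x, max_h):
    """Return {h: Q_h} for h = 1, ..., max_h, where Q_h is the sum of
    squared h-th differences divided by (n - h) * C(2h, h)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    q = {}
    for h in range(1, max_h + 1):
        y = np.diff(x, n=h)  # h-th forward differences, length n - h
        q[h] = float(y @ y) / ((n - h) * comb(2 * h, h))
    return q
```

For a series consisting of a polynomial of degree k plus independent
errors, the values of $Q_h$ should settle near $\sigma^2$ once h
exceeds k.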

In order to test the approximate equality of $Q_h$ and $Q_{h+1}$,
the standard error of $Q_{h+1} - Q_h$ is computed:

$$s_h = \frac{Q_h}{H_{hn}}, \qquad h = 0, 1, 2, \ldots ,$$

$H_{hn}$ has been tabulated by Tintner (1940). An asymptotic formula
can be used for large values of n and $h \ge 6$:

$$s_h^2 = \frac{(3h+1)\, Q_h^2 \sqrt{\pi h}}{\sqrt{2}\,(2h+1)^3 (n-h-1)}, \qquad h = 6, 7, \ldots .$$

The quantity

$$R_h = \frac{Q_h - Q_{h+1}}{s_h} = \frac{Q_h - Q_{h+1}}{Q_h}\, H_{hn}, \qquad h = 0, 1, 2, \ldots$$

is approximately N(0,1) for large samples (O. Anderson, 1929).
Hence if an $h_0$ is found such that $R_{h_0-1}$ is significant but
$R_{h_0}$ is not significant at the chosen significance level it is
assumed that the systematic part of the series has been
approximately eliminated in the $h_0$th difference series. It should
be pointed out that the choice of the order of difference $h_0$ is a
multiple choice problem. Therefore the maintaining of a fixed level
of significance is extremely difficult.

The estimate $Q_{h_0}$ of $\sigma^2$ is not completely efficient
since one observation is lost each time one takes a higher
difference. Morse and Grubbs (1947) have treated this problem and
present a table for evaluating the efficiency for various n and
$h_0$. The higher the order of the series of differences from which
the variance has been estimated the less efficient is the estimate.

Applicability of the variate difference method is thought to
be limited even by Tintner whose work with this method has been
extensive. This method is not valid when the errors are
autocorrelated and an autoregressive scheme should be used under
such circumstances.

Oscillatory  and  Periodic  Movements 

In some types of data a distinct oscillatory movement may be
apparent. Suppose it is known that such a time series $x_t$,
$t = \ldots, -1, 0, +1, \ldots$, has the periodic parametric form

$$x_t = A_0 + \sum_{p=1}^{k} \left[ A_p \cos \omega_p t + B_p \sin \omega_p t \right] + \epsilon_t$$

where $\epsilon_t$ is a random element with mean zero and variance
$\sigma^2$, $A_0$, $A_p$, $B_p$, and $\omega_p$ are known real
constants and $0 \le \omega_p \le \pi$. For simplicity, it is assumed
for the moment that $A_0$ is zero.

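Consider a sample $(x_0, \ldots, x_{n-1})$ from the time series. If
one multiplies through by $\cos \omega t$, $0 \le \omega \le \pi$,
sums over t and divides by n then

$$\alpha(\omega) = \frac{1}{n} \sum_{t=0}^{n-1} x_t \cos \omega t = \frac{1}{n} \sum_{t=0}^{n-1} \left[ \sum_{p=1}^{k} \left( A_p \cos \omega_p t \cos \omega t + B_p \sin \omega_p t \cos \omega t \right) + \epsilon_t \cos \omega t \right].$$

The expression can now be rewritten

$$\alpha(\omega) = \frac{1}{n} \sum_{p=1}^{k} \left\{ \frac{A_p}{2} \sum_{t=0}^{n-1} \left[ \cos(\omega_p + \omega)t + \cos(\omega_p - \omega)t \right] + \frac{B_p}{2} \sum_{t=0}^{n-1} \left[ \sin(\omega_p + \omega)t + \sin(\omega_p - \omega)t \right] \right\} + \frac{1}{n} \sum_{t=0}^{n-1} \epsilon_t \cos \omega t.$$

Now as $n \to \infty$,

$$\alpha(\omega) \to A_p/2, \qquad \omega = \omega_p, \quad p = 1, \ldots, k.$$

Similarly, if the original expression is multiplied through by
$\sin \omega t$ and denoted by $\alpha'(\omega)$, and the
corresponding operations carried out, as $n \to \infty$,

$$\alpha'(\omega) \to B_p/2, \qquad \omega = \omega_p, \quad p = 1, \ldots, k.$$

Therefore, if $2\pi/\omega_p$ is a genuine period of the time series
$x_t$, $t = \ldots, -1, 0, +1, \ldots$,
$\frac{1}{n} \sum_{t=0}^{n-1} x_t \cos \omega_p t$ will tend to be
near $A_p/2$ and $\frac{1}{n} \sum_{t=0}^{n-1} x_t \sin \omega_p t$
will tend to be near $B_p/2$.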
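
The two averages can be checked numerically with a short Python
sketch. The function name `cos_sin_averages` is hypothetical; the
routine returns the pair $(\alpha(\omega), \alpha'(\omega))$ for a
given series and trial frequency.

```python
import numpy as np

def cos_sin_averages(x, omega):
    """Return ((1/n) sum x_t cos(omega t), (1/n) sum x_t sin(omega t))
    for t = 0, ..., n-1."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    alpha = float(np.mean(x * np.cos(omega * t)))
    alpha_prime = float(np.mean(x * np.sin(omega * t)))
    return alpha, alpha_prime
```

For an error-free series $3 \cos \omega_p t + 4 \sin \omega_p t$
observed over a whole number of cycles, the two averages at
$\omega_p$ are one half of the respective amplitudes, and they are
zero at any other frequency completing a whole number of cycles.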

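Let n in the sample $(x_1, \ldots, x_n)$ be odd, say $2r+1$, and let

$$\omega_p = \frac{2\pi p}{2r+1}.$$

The form of the periodic function is now

$$x_t = A_0 + \sum_{p=1}^{k} \left( A_p \cos p\theta_t + B_p \sin p\theta_t \right) + \epsilon_t$$

where $\theta_t = 2\pi t/(2r+1)$.

If the periods of the time series are known but the $A_0$, $A_p$,
$B_p$, $p = 1, \ldots, k$ are unknown, the minimum variance
estimators for these quantities can be found by least squares. These
least squares equations are

$$\sum_{t=0}^{n-1} \left[ x_t - A_0 - \sum_{p=1}^{k} \left( A_p \cos p\theta_t + B_p \sin p\theta_t \right) \right] = 0$$

and

$$\sum_{t=0}^{n-1} \left[ x_t - A_0 - \sum_{p=1}^{k} \left( A_p \cos p\theta_t + B_p \sin p\theta_t \right) \right] \begin{Bmatrix} \sin h\theta_t \\ \cos j\theta_t \end{Bmatrix} = 0, \qquad h, j = 1, \ldots, k.$$

The standard formula for the sum of a cosine progression gives

$$\sum_{t=0}^{n-1} \cos m\theta_t = \sin(\tfrac12 m n \phi) \cos(\tfrac12 m (n-1) \phi) \big/ \sin(\tfrac12 m \phi)$$

where $\phi = \theta_t / t$. For integral values of m this will vanish
since $\tfrac12 m n \phi = m\pi$. Therefore all sums of the form

$$\sum_t \cos h\theta_t \cos j\theta_t = \tfrac12 \left[ \sum_t \cos(h+j)\theta_t + \sum_t \cos(h-j)\theta_t \right]$$

will vanish unless $h = j$, when the value will be $\tfrac12 n$.

The first of the least squares equations gives the result

$$a_0 = \frac{1}{n} \sum_{t=0}^{n-1} x_t ,$$

and the remaining equations give in the same way

$$a_p = \frac{2}{n} \sum_{t=0}^{n-1} x_t \cos p\theta_t, \qquad b_p = \frac{2}{n} \sum_{t=0}^{n-1} x_t \sin p\theta_t, \qquad p = 1, \ldots, k.$$
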
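From the normal equations, the covariance matrix is found to
be

$$\begin{bmatrix} n/2 & 0 & \cdots & 0 \\ 0 & n/2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & n/2 \end{bmatrix}$$

and its inverse is

$$\begin{bmatrix} 2/n & 0 & \cdots & 0 \\ 0 & 2/n & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 2/n \end{bmatrix}.$$
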
It follows that coefficients are random variables with zero
covariances and variances $2\sigma^2/(2r+1)$. The variance of a
fitted value is $(2k+1)\sigma^2/n$ and is independent of the angle
$\theta_t$. The residual sum of squares
$\sum_{t=0}^{n-1} (x_t - \hat{x}_t)^2$ is given by

$$\sum_{t=0}^{n-1} x_t^2 - n a_0^2 - \tfrac12 n \sum_{p=1}^{k} (a_p^2 + b_p^2).$$

The expectation of this is $(n-2k-1)\sigma^2$ and so the variance of
an observation is estimated by

$$s^2 = \frac{\sum_{t=0}^{n-1} (x_t - \hat{x}_t)^2}{n-2k-1}.$$
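
Because the normal equations are diagonal, the harmonic fit needs no
matrix inversion. A minimal Python sketch follows, assuming n odd and
k smaller than (n-1)/2 so that the divisor n-2k-1 is positive; the
function name `harmonic_fit` is hypothetical.

```python
import numpy as np

def harmonic_fit(x, k):
    """Least squares fit of a_0 + sum_p (a_p cos p*theta_t + b_p sin p*theta_t),
    theta_t = 2*pi*t/n, t = 0, ..., n-1. Returns (a0, a, b, s2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta = 2.0 * np.pi * np.arange(n) / n
    a0 = float(x.mean())
    a = np.array([2.0 / n * np.sum(x * np.cos(p * theta)) for p in range(1, k + 1)])
    b = np.array([2.0 / n * np.sum(x * np.sin(p * theta)) for p in range(1, k + 1)])
    # Residual sum of squares from the identity in the text.
    rss = float(np.sum(x ** 2) - n * a0 ** 2 - 0.5 * n * np.sum(a ** 2 + b ** 2))
    s2 = rss / (n - 2 * k - 1)
    return a0, a, b, s2
```

The intensities $a_p^2 + b_p^2$ used in the significance tests can be
formed directly from the returned arrays `a` and `b`.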

If the time series has periodic form but the $A_p$, $B_p$, and
$\omega_p$ and even k are unknown, a method of searching for
suspected periods is necessary. The behavior of the mean values of
$\alpha(\omega)$ and $\alpha'(\omega)$ described earlier suggests
that these values considered as functions of $\omega$ might be
useful in screening out true periods if any exist.

As early as 1898, Schuster proposed a method of searching
for possible periods. Walker in 1914, and Fisher in 1929, followed
with methods of testing the significance of suspected periods. This
method, generally known as periodogram analysis, tests the
significance of possible periods under the assumption that $x_t$ is
a white noise.

If it is assumed that there are no true periods at all in
the given time series, then the $A_p$ and $B_p$ are all zero. But if
$2\pi/\omega_p$ is a true period, the behavior of $\alpha(\omega_p)$
and $\alpha'(\omega_p)$ indicates that both of these will tend to
have values away from zero for large n. The quantity
$\alpha^2(\omega_p) + \alpha'^2(\omega_p)$ has a value for each
$\omega_p$, $p = 1, \ldots, k$, so one needs a way of testing whether
the largest (or mth largest) of these quantities is significantly
large under the assumption that $x_t$ is a white noise.


The further assumption that $x_t$ is normal white noise with
variance $\sigma^2$ is made in dealing with the problem of
significance testing. The quantities

$$\sqrt{2(2r+1)}\,\alpha(\omega_p)/\sigma, \quad \sqrt{2(2r+1)}\,\alpha'(\omega_p)/\sigma, \qquad p = 1, \ldots, k$$

are 2k independent random variables distributed N(0,1). If

$$u_p = \frac{2r+1}{\sigma^2} \left[ \alpha^2(\omega_p) + \alpha'^2(\omega_p) \right], \qquad p = 1, \ldots, k,$$

then $2u_1, \ldots, 2u_k$ are chi square variables with 2 degrees of
freedom, and $u_1, \ldots, u_k$ have probability element

$$e^{-(u_1 + \cdots + u_k)}\, du_1 \cdots du_k$$

for $u_p \ge 0$, $p = 1, \ldots, k$, and 0 otherwise. The problem of
whether the largest (or mth largest) of the quantities
$\alpha^2(\omega_p) + \alpha'^2(\omega_p)$ is significantly large
reduces to testing whether the largest of $u_1, \ldots, u_k$ is
significantly large. The test which suggests itself, since
$\sigma^2$ is unknown, is whether the largest (or mth largest) of
the ratios

$$g_p = \frac{u_p}{u_1 + \cdots + u_k}, \qquad p = 1, \ldots, k$$

is significantly large.
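
Since the unknown $\sigma^2$ and the factor $2r+1$ cancel in each
ratio, the $g_p$ can be computed from the squared averages alone. A
minimal Python sketch, with the hypothetical name
`periodogram_ratios` and with the frequencies taken as
$\omega_p = 2\pi p/n$:

```python
import numpy as np

def periodogram_ratios(x, k):
    """Return g_p = u_p / (u_1 + ... + u_k) for p = 1, ..., k, where u_p is
    proportional to alpha(w_p)^2 + alpha'(w_p)^2 and w_p = 2*pi*p/n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n)
    u = np.empty(k)
    for p in range(1, k + 1):
        w = 2.0 * np.pi * p / n
        alpha = np.mean(x * np.cos(w * t))
        alpha_prime = np.mean(x * np.sin(w * t))
        u[p - 1] = alpha ** 2 + alpha_prime ** 2
    return u / u.sum()
```

The largest element of the returned array is the quantity whose
significance is in question.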

Walker's (1914) criterion was that the chance for the largest
intensity to exceed a given level x is given by
$1 - (1 - e^{-x})^k$. Fisher (1929) found the distribution function
of g:

$$P(g > g') = \sum_{p=0}^{r} (-1)^p \binom{k}{p+1} \left[ 1 - (p+1) g' \right]^{k-1}$$

where r is the largest integer $\le k-1$ for which
$1 - (r+1) g' \ge 0$. For a given k and a given $\alpha$ the value of
$g_\alpha$ for which $P(g > g_\alpha) = \alpha$ would be the critical
value of g for significance level 100$\alpha$ %. Tabulations of
$g_\alpha$ have been made by Davis (1941) for a wide range of values
of $\alpha$ and k.
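
Where tables are not at hand, Fisher's series can be summed directly.
The sketch below is in Python; the function name `fisher_g_pvalue` is
hypothetical. The alternating sum may lose accuracy for very large k,
so the sketch is intended for moderate k only.

```python
from math import comb

def fisher_g_pvalue(g_obs, k):
    """P(g > g_obs) for the largest of k ratios under normal white noise,
    summing terms while 1 - (p + 1) * g_obs is non-negative."""
    total = 0.0
    p = 0
    while p <= k - 1 and 1.0 - (p + 1) * g_obs >= 0.0:
        total += (-1) ** p * comb(k, p + 1) * (1.0 - (p + 1) * g_obs) ** (k - 1)
        p += 1
    return total
```

For k = 2 the series reduces to $2(1 - g')$, which agrees with the
fact that $u_1/(u_1 + u_2)$ is then uniformly distributed on (0, 1).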

Similarly if g is defined as the mth largest of $u_1, \ldots, u_k$
divided by $u_1 + \cdots + u_k$, then

$$P(g > g') = \frac{k!}{(m-1)!} \sum_{p=m}^{r} \frac{(-1)^{p-m} (1 - p g')^{k-1}}{p\,(k-p)!\,(p-m)!}$$

where r is the largest integer for which $1 - r g' \ge 0$.

Hartley (1949) proposed a method for testing the significance
of periods using the F ratio. The observed intensities
$S_p^2 = a_p^2 + b_p^2$ are computed and the significance of the
largest intensity is tested.

Hartley starts with the hypothesis of a completely random
series where the $x_t$ are normal deviates with variance $\sigma^2$.
The intensities $\tfrac12 n S_p^2 / \sigma^2$, $p = 1, \ldots, k$, are
all independent $\chi^2$ variates, each with 2 degrees of freedom.
Walker's criterion is converted into an exact test by making use of
the residual as an independent estimate of $\sigma^2$. The test that
results is one for the maximum variance ratio

$$F_{\max} = \frac{n S_{\max}^2 (n-2k-1)}{4 \sum_{t=0}^{n-1} (x_t - \hat{x}_t)^2} = \frac{n S_{\max}^2}{4 s^2}.$$

The probability for $F_{\max} \le F^*$ is given by

$$P(F^*) = \int_0^\infty \phi_\nu(s) \left( 1 - \exp[-s^2 F^*] \right)^k ds$$

where $\phi_\nu(s)$ denotes the distribution of a sample standard
deviation based on $\nu$ degrees of freedom. For this problem,
$\nu = n-2k-1$. Hartley uses an approximation to the integral valid
only for upper percentage points. Instead of evaluating the upper
100$\alpha$ % point of the distribution, the 100$\alpha$/k % point of
the F distribution based on 2 and $\nu$ degrees of freedom is used.
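
The approximate critical value needs no table, because the upper tail
of the F distribution with 2 and $\nu$ degrees of freedom has the
closed form $P(F > f) = (1 + 2f/\nu)^{-\nu/2}$. The Python sketch
below inverts this expression at the level $\alpha/k$; the function
name `hartley_critical_value` is hypothetical.

```python
def hartley_critical_value(alpha, k, nu):
    """Upper alpha/k point of F with 2 and nu degrees of freedom,
    obtained by solving (1 + 2 f / nu) ** (-nu / 2) = alpha / k for f."""
    tail = alpha / k
    return 0.5 * nu * (tail ** (-2.0 / nu) - 1.0)
```

An observed $F_{\max}$ larger than the returned value would be judged
significant at approximately the 100$\alpha$ % level.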

If the series $x_t$ is of the periodic form, then of the k
periods examined, some, say h, have positive amplitudes and the
remaining k-h have zero amplitudes. This says that
$A_p^2 + B_p^2 > 0$ for h values of p, and $A_p^2 + B_p^2 = 0$ for
k-h values of p. If the maximum observed intensity is judged
significant by the test, the conclusion that $A_p^2 + B_p^2 > 0$ for
that particular p for which the maximum intensity was observed is
not strictly valid. Only rejection of the hypothesis of randomness
is justified. In practice, however, it is usually desired to
conclude that the maximum intensity observed indicates that for the
particular p the $A_p^2 + B_p^2 > 0$.

To investigate the extent to which the experimenter might be
misled by a significant test, it is necessary to investigate the
power of the test. The chance of reaching a significant result is
the sum of the chances of two situations:

1. The observed maximum intensity $\tfrac12 n S_{\max}^2$ does come
from the set of h positive intensities, and

2. The observed maximum intensity does come from
the set of k-h true zero intensities.

A wrong conclusion would be reached if the second situation
occurs. It has been shown by Hartley (1949) that this chance is
smaller than (k-h)/k times the error of the first kind, and hence,
if the independent harmonic intensities are used in the $F_{\max}$
test, the chance of reaching a wrong conclusion is almost
negligible.

In order to examine the $F_{\max}$ test under the general
hypothesis

$$x_t = M_t + \epsilon_t, \qquad t = 0, 1, \ldots, n-1,$$

where the $\epsilon_t$ are random normal deviates, the $M_t$ are
represented by their complete, finite Fourier expansion with n
assumed odd for convenience. The general hypothesis can now be
written

$$x_t = A_0 + \sum_{p=1}^{\frac12 (n-1)} \left( A_p \cos p\theta_t + B_p \sin p\theta_t \right) + \epsilon_t.$$

This differs from the previously stated periodic form in that the
representation of $x_t$ includes $(n-1)-2k$ additional real Fourier
terms. The magnitude of these additional terms as a percentage of
the variance of the $\epsilon_t$ can be expressed by a
non-centrality ratio (Hartley, 1949)

$$\delta = \tfrac12 n \sum_{p=k+1}^{\frac12 (n-1)} (A_p^2 + B_p^2) / \sigma^2.$$

If the series of $x_t$ are the ordinates of a smooth function, then
from standard Fourier theory it is known that for a sufficiently
large k, $\delta$ can be made as small as is required for any
$n > 2k$. In practice then, if periods up to order m are suspected,
the $F_{\max}$ test will detect only these if $n \gg 2m$. If too
small values of m and n are used, the $F_{\max}$ test, which is based
on the assumption that $\delta$ is zero, will be biased by an amount
depending on the value of $\delta$. This effect can be calculated
exactly using the methods of Hartley. It is stressed that the
$F_{\max}$ test is inappropriate unless the $x_t$ can be represented
by a moderate number of Fourier terms and yet $\delta$ will be
expected to be zero or small.

The residuals from the final fitted curve can be examined for
autocorrelation to see if the periodic behavior has been adequately
described. Use of the circular coefficient of autocorrelation
with lag L is appropriate in this case. For lag 1, Anderson and
Anderson (1950) have calculated tables of significance points of
R where

$$R = \frac{\sum_{t=1}^{n} (x_t - M_t)(x_{t-1} - M_{t-1})}{\sum_{t=1}^{n} (x_t - M_t)^2}, \qquad x_0 = x_n$$

is the circular autocorrelation coefficient used for residuals
from a Fourier series. Since economic data were the primary
motivation for the work, the basic periods for which the R values
were calculated are for p = 2, 3, 4, 6 and 12, indicating the usual
yearly increments for which economic data are tabulated.
Significance levels are for $\alpha = .01$ and $\alpha = .05$ and N
ranges from 6 to the point in each distribution where tables for the
regular coefficient of correlation or of the incomplete beta
function give satisfactory approximations.
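
The circular coefficient is simple to compute from a set of
residuals. The Python sketch below treats the residuals as lying on
a circle, so that the value preceding the first is the last; the
function name `circular_autocorrelation` is hypothetical, and the
divisor is taken to be the residual sum of squares.

```python
import numpy as np

def circular_autocorrelation(resid, lag=1):
    """Circular autocorrelation of the residuals at the given lag:
    sum of e_t * e_(t-lag), indices wrapping around, over sum of e_t^2."""
    e = np.asarray(resid, dtype=float)
    return float(np.sum(e * np.roll(e, lag)) / np.sum(e * e))
```

Residuals that alternate in sign give a lag 1 value of -1, and
residuals free of autocorrelation should give values near zero at
every lag.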

Using the same data for which the correlogram was plotted in
Fig. 2, harmonic analysis and Hartley's method can be used to test
for periods in the data. If it is assumed that the periods will
be no shorter than one month in length, then an upper limit of 12
can be used in searching for periods in the data. The results of
such a harmonic analysis are given in Table 2.

The Hartley test at the 1% level of significance gives periods
1, 2, 3, and 4 as significant. Therefore the resulting model is

$$\hat{x}_t = a_0 + \sum_{p=1}^{4} \left( a_p \cos p\theta_t + b_p \sin p\theta_t \right)$$

with the coefficients given in Table 2. The coefficient of
determination resulting from this model is .6353 and multiple R
will be .7971. The standard deviation of an observation is .0425.

To see whether the systematic portion of the variation has
been sufficiently explained, the autocorrelation of the residuals
is examined. Table 3 gives the coefficients of autocorrelation
for the residuals from the regression line. None of the
coefficients calculated are near to exceeding the significance level
for a reasonable $\alpha$. It seems valid to conclude then that the
data are periodic in nature and that the fitted model gives good
estimates of the true parameters.


Table 2. Harmonic Analysis of Probabilities of
Dry Days for Manhattan, Kansas.

a_0 = .849454

 p       a_p         b_p      Intensity    F Ratio

 1     .073138     .009187     .005433     279.53*
 2     .005515     .017995     .000354      18.22*
 3    -.008014     .012171     .000121      10.93*
 4     .005011     .010217     .000129       6.66*
 5    -.000671     .001261     .000002       0.10
 6    -.002125     .006155     .000042       2.18
 7    -.002107     .001613     .000007       0.36
 8    -.002280     .003540     .000018       0.91
 9     .004523     .002480     .000027       1.37
10    -.006448     .004420     .000061       3.14
11    -.000200     .002886     .000008       0.43
12    -.004532     .005387     .000050       2.55

* significant at 1% level


Table 3. Coefficients of Autocorrelation of Residuals
from Regression Model with p=4.

Lag     R_L       Lag     R_L       Lag     R_L

 1     .0979       8    -.0602      15    -.0169
 2    -.0544       9    -.0785      16    -.0949
 3    -.0771      10     .0146      17    -.0216
 4     .0524      11    -.0482      18    -.0746
 5    -.0004      12     .0345      19    -.0690
 6    -.0190      13     .0167      20    -.0038
 7    -.1091      14     .0183


SUMMARY AND CONCLUSIONS

Problems involved in the analysis of time series data have
concerned statisticians since statistics emerged as a separate
discipline. Most of the well-known statisticians have, at one
time or another, made some contribution to the theory of time
series analysis.

The first area of the problem that was attacked was that of
testing for the existence of autocorrelation, and considerable
progress has been made. Less well developed are the areas of
estimation and hypothesis testing. It has been illustrated here
that significant contributions are still being made and much
remains to be done in these areas. Small sample theory is extremely
vague and efficient goodness-of-fit tests are practically
non-existent.

Were it not for the fact that most time series, particularly
those in economics, are relatively short, the non-independence of
the $\epsilon$'s would pose a much less serious problem. The theory
of least squares estimation should be used wherever applicable. It
is possible that fitting the ordinary least squares line is the
best starting point in the analysis of a time series.

The methods presented here are generally amenable to
programming for computers.

In the sciences many problems occur in which a process produces
what may be considered a family of random variables such
that there is a value of $X_t$ for each value of t in some interval
T. The experimenter usually wishes to investigate the nature of
the response curve over the interval T.

The usual regression model in least squares analysis is

$$X_t = \beta_0 + \sum_{i=1}^{r} \beta_i Z_{it} + \epsilon_t, \qquad t = 0, 1, \ldots, n-1,$$

where the Z's are assumed fixed in repeated sampling and the
$\epsilon$'s are independently distributed with mean zero and
variance $\sigma^2$. In applying traditional least squares methods
to time series data difficulties may arise because successive
observations often lack the property of independence. If the
$\epsilon$'s are not independent the assumptions necessary for using
ordinary least squares estimation theory are violated.

A concern for problems involved in analyzing time series data
was lacking among research workers until the middle 1920's. In
the 1930's Koopmans, Wold, and others clarified the sampling
significance of regression analysis of time series data.
Considerable work followed on the problem of testing for the
existence of correlation of the errors but all too little on the
more important problem of the best estimation procedure when
correlations do exist. Results are still somewhat lacking in this
area, but several estimation procedures have been proposed since
1950. This paper deals with the methods generally used in the social
and biological sciences.


The discussion of time series is usually confined to what are
called stationary time series. Roughly, this means that the time
series is without trends, not only in the mean values of the $X_t$
but also in their variances.

Three models are generally used in analyzing time series data.
If it is assumed that the data follow an underlying systematic
scheme with random fluctuations superimposed, the methods of
harmonic analysis and periodogram analysis may be used to determine
the nature of a systematic component with regular periods. For
a systematic component with an irregular oscillatory movement the
variables $Z_{it}$ in the regular regression model may be replaced
by $t^i$'s and a polynomial form used to locally describe the
function. In other cases, the assumed model may involve lagged
values of X as predictors or possibly both lagged X's and Z's. This
is referred to as an autoregressive model.

Much remains to be done on the time series problem in the
areas of estimation and hypothesis testing. Small sample theory
is extremely vague and efficient goodness of fit tests are
practically non-existent.


